
Virtual space


Spatial Computing Communications for Multi-User Virtual Reality in Distributed Mobile Edge Computing Network

Xu, Caolu, Chen, Zhiyong, Tao, Meixia, Song, Li, Zhang, Wenjun

arXiv.org Artificial Intelligence

Abstract--Immersive virtual reality (VR) applications impose stringent requirements on latency, energy efficiency, and computational resources, particularly in multi-user interactive scenarios. To address these challenges, we introduce the concept of spatial computing communications (SCC), a framework designed to meet the latency and energy demands of multi-user VR over distributed mobile edge computing (MEC) networks. SCC jointly represents the physical space, defined by users and base stations, and the virtual space, representing shared immersive environments, using a probabilistic model of user dynamics and resource requirements. The resource deployment task is then formulated as a multi-objective combinatorial optimization (MOCO) problem that simultaneously minimizes system latency and energy consumption across distributed MEC resources. To solve this problem, we propose MO-CMPO, a multi-objective consistency model with policy optimization that integrates supervised learning and reinforcement learning (RL) fine-tuning guided by preference weights. Leveraging a sparse graph neural network (GNN), MO-CMPO efficiently generates Pareto-optimal solutions. Simulations with real-world New Radio base station datasets demonstrate that MO-CMPO achieves superior hypervolume performance and significantly lower inference latency than baseline methods. Furthermore, the analysis reveals practical deployment patterns: latency-oriented solutions favor local MEC execution to reduce transmission delay, while energy-oriented solutions minimize redundant placements to save energy.
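The hypervolume indicator used to compare methods above has a simple closed form for two objectives. The sketch below is illustrative only (the point values and reference point are invented, and this is not the paper's code): it measures the latency/energy region a Pareto set dominates, under minimization, relative to a reference point.

```python
def hypervolume_2d(points, ref):
    """Hypervolume dominated by a 2-objective point set (both objectives
    minimized), measured against a reference point ref = (r1, r2)."""
    # Filter to the non-dominated front: sort by the first objective and
    # keep only points that strictly improve the second.
    front, best_y = [], float("inf")
    for x, y in sorted(set(points)):
        if y < best_y:
            front.append((x, y))
            best_y = y
    # Sweep left to right, adding the rectangle each point contributes.
    hv, prev_y = 0.0, ref[1]
    for x, y in front:
        hv += (ref[0] - x) * (prev_y - y)
        prev_y = y
    return hv
```

A larger hypervolume means the front covers more of the latency/energy trade-off space, which is why it serves as a scalar quality measure for Pareto sets.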


My Virtual Avatar No Longer Looks Terrible in the Apple Vision Pro

WIRED

Remember Apple's Vision Pro? That's the $3,499 mixed reality headset the company launched early in 2024 that failed to garner much public interest. Apple has steamed ahead with updates for the platform over the past year, and soon there will be a new version upgrade: visionOS 26. I got a chance to try out a few of the new capabilities, but two stuck out to me more than the others. First is the upgrade to Personas. That's the spatial avatar the headset creates based on your likeness using the onboard cameras.


VoI-Driven Joint Optimization of Control and Communication in Vehicular Digital Twin Network

Lei, Lei, Zheng, Kan, Mei, Jie, Shen, Xuemin

arXiv.org Artificial Intelligence

The vision of sixth-generation (6G) wireless networks paves the way for the seamless integration of digital twins into vehicular networks, giving rise to a Vehicular Digital Twin Network (VDTN). The large amount of computing resources as well as the massive amount of spatial-temporal data in the Digital Twin (DT) domain can be utilized to enhance the communication and control performance of Internet of Vehicles (IoV) systems. In this article, we first propose the architecture of VDTN, emphasizing key modules that center on functions related to the joint optimization of control and communication. We then delve into the intricacies of the multi-timescale decision process inherent in joint optimization in VDTN, specifically investigating the dynamic interplay between control and communication. To facilitate the joint optimization, we define two Value of Information (VoI) concepts rooted in control performance. Subsequently, utilizing VoI as a bridge between control and communication, we introduce a novel joint optimization framework, which involves iterative processing of two Deep Reinforcement Learning (DRL) modules corresponding to control and communication to derive the optimal policy. Finally, we conduct simulations of the proposed framework applied to a platoon scenario to demonstrate its effectiveness.
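The idea of using VoI to gate communication can be made concrete with a toy rule. Everything below is an assumption for illustration (the quadratic error term, the linear age factor, and all constants are invented, not the paper's model): a vehicle transmits a state update only when its estimated value exceeds the communication cost.

```python
def should_transmit(state_error, age, voi_weight=1.0, comm_cost=0.5):
    """Toy VoI gate: stale, inaccurate state estimates are worth
    transmitting; fresh, accurate ones are not. The quadratic error
    term and (1 + age) staleness factor are illustrative assumptions."""
    voi = voi_weight * (state_error ** 2) * (1 + age)
    return voi > comm_cost
```

In the paper's framework the VoI itself is learned from control performance rather than hand-crafted like this, but the gating role it plays is the same.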


Artificial Spacetimes for Reactive Control of Resource-Limited Robots

Reinhardt, William H., Miskin, Marc Z.

arXiv.org Artificial Intelligence

Field-based reactive control provides a minimalist, decentralized route to guiding robots that lack onboard computation. Such schemes are well suited to resource-limited machines like microrobots, yet implementation artifacts, limited behaviors, and the frequent lack of formal guarantees blunt adoption. Here, we address these challenges with a new geometric approach called artificial spacetimes. We show that reactive robots navigating control fields obey the same dynamics as light rays in general relativity. This surprising connection allows us to adopt techniques from relativity and optics for constructing and analyzing control fields. When implemented, artificial spacetimes guide robots around structured environments, simultaneously avoiding boundaries and executing tasks like rallying or sorting, even when the field itself is static. We augment these capabilities with formal tools for analyzing what robots will do and provide experimental validation with silicon-based microrobots. Combined, this work provides a new framework for generating composed robot behaviors with minimal overhead.
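Field-based reactive control of the kind described above can be sketched with a classic artificial-potential field: attraction toward a goal plus repulsion from obstacles. This is a simpler cousin of the paper's spacetime construction, not its actual method, and all constants are invented for illustration.

```python
def step(pos, goal, obstacles, dt=0.1):
    """One reactive step in a static control field: the robot simply
    follows the local field vector, with no onboard planning."""
    # Attractive component: straight pull toward the goal.
    fx, fy = goal[0] - pos[0], goal[1] - pos[1]
    # Repulsive component: inverse-square push away from each obstacle.
    for ox, oy in obstacles:
        dx, dy = pos[0] - ox, pos[1] - oy
        d2 = dx * dx + dy * dy + 1e-9  # avoid division by zero
        fx += dx / d2
        fy += dy / d2
    return (pos[0] + dt * fx, pos[1] + dt * fy)
```

Iterating this step from any start position drives the robot toward the goal while steering it around obstacles, which is the "reactive" property: the behavior is encoded entirely in the field, not in the robot.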


Sony's XYN XR headset is being used in very different ways at CES 2025

Engadget

At CES last year, Sony teased an AR/VR headset prototype focused on "spatial content creation." And at the same time, Siemens announced it was working with Sony to use that same hardware, including the two new controllers it developed, for something it was calling the "industrial metaverse." That's a lot of buzzwords, but at CES 2025 both Siemens and Sony showed the headsets and associated software in action, which helped clear up a lot of what the companies are trying to do here. During Sony's CES press conference, it announced its XYN brand of software and hardware solutions, with the headset being a key part of the equation. The XYN "spatial capture solution" uses mirrorless cameras to scan objects and turn them into photorealistic 3D models. Using the XYN headset, you can view those models in 3D production software for animation, video games and other potential uses.


Toward a Predictive eXtended Reality Teleoperation System with Duo-Virtual Spaces

Zhang, Ziliang, Liu, Cong, Kim, Hyoseung

arXiv.org Artificial Intelligence

Extended Reality (XR) provides a more intuitive interaction method for teleoperating robots compared to traditional 2D controls. Recent studies have laid the groundwork for usable teleoperation with XR, but such systems fail in tasks requiring rapid motion and precise manipulation due to the large delay between user motion and agent feedback. In this work, we profile the end-to-end latency in a state-of-the-art XR teleoperation system and propose to reduce it with a duo-virtual-space design: the agent and objects are localized in the user-side virtual space, calibrated with periodic ground-truth poses from the agent-side virtual space.
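The calibration idea described above resembles dead reckoning with periodic correction. A minimal sketch, assuming a 1-D pose and constant-velocity prediction (both simplifications of my own, not the paper's design): the user side extrapolates the agent's pose locally and snaps to ground truth whenever an update arrives.

```python
class PredictivePose:
    """Dead-reckon a remote agent's 1-D pose locally; snap to ground
    truth when a periodic update arrives. A toy stand-in for the
    duo-virtual-space calibration, assuming constant velocity."""

    def __init__(self):
        self.pose = 0.0
        self.velocity = 0.0
        self._last_true = None  # (time, pose) of the last ground truth

    def predict(self, dt):
        # Local extrapolation between ground-truth updates.
        self.pose += self.velocity * dt
        return self.pose

    def calibrate(self, t, true_pose):
        # Re-estimate velocity from consecutive ground truths, then snap.
        if self._last_true is not None:
            t0, p0 = self._last_true
            if t > t0:
                self.velocity = (true_pose - p0) / (t - t0)
        self._last_true = (t, true_pose)
        self.pose = true_pose
```

Between updates the user sees the low-latency predicted pose; the periodic snap bounds how far the prediction can drift from the agent's true state.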


DVPE: Divided View Position Embedding for Multi-View 3D Object Detection

Wang, Jiasen, Li, Zhenglin, Sun, Ke, Liu, Xianyuan, Zhou, Yang

arXiv.org Artificial Intelligence

Sparse query-based paradigms have achieved significant success in multi-view 3D detection for autonomous vehicles. Current research faces challenges in balancing between enlarging receptive fields and reducing interference when aggregating multi-view features. Moreover, different poses of cameras present challenges in training global attention models. To address these problems, this paper proposes a divided view method, in which features are modeled globally via the visibility cross-attention mechanism, but interact only with partial features in a divided local virtual space. This effectively reduces interference from other irrelevant features and alleviates the training difficulties of the transformer by decoupling the position embedding from camera poses. Additionally, 2D historical RoI features are incorporated into the object-centric temporal modeling to utilize high-level visual semantic information. The model is trained using a one-to-many assignment strategy to facilitate stability. Our framework, named DVPE, achieves state-of-the-art performance (57.2% mAP and 64.5% NDS) on the nuScenes test set. Codes will be available at https://github.com/dop0/DVPE.
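The "divided local virtual space" idea can be illustrated with a simple partition. The sketch below is an assumption of mine, not the DVPE code: it buckets query reference points into angular sectors around the ego vehicle, so each query would only attend to features from its own sector.

```python
import math

def divide_views(queries, num_views=4):
    """Bucket 2-D reference points (x, y in the ego frame) into angular
    sectors, a toy sketch of restricting attention to a local view."""
    buckets = [[] for _ in range(num_views)]
    width = 2 * math.pi / num_views
    for i, (x, y) in enumerate(queries):
        angle = math.atan2(y, x) % (2 * math.pi)
        buckets[int(angle // width) % num_views].append(i)
    return buckets
```

Restricting each query's interactions to one bucket is what cuts interference from irrelevant features while keeping the attention computation itself global in form.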


Layout Generation Agents with Large Language Models

Sasazawa, Yuichi, Sogawa, Yasuhiro

arXiv.org Artificial Intelligence

In recent years, there has been an increasing demand for customizable 3D virtual spaces. Due to the significant human effort required to create these virtual spaces, there is a need for efficiency in virtual space creation. While existing studies have proposed methods for automatically generating layouts such as floor plans and furniture arrangements, these methods only generate text indicating the layout structure based on user instructions, without utilizing the information obtained during the generation process. In this study, we propose an agent-driven layout generation system using the GPT-4V multimodal large language model and validate its effectiveness. Specifically, the language model manipulates agents to sequentially place objects in the virtual space, thus generating layouts that reflect user instructions. Experimental results confirm that our proposed method can generate virtual spaces reflecting user instructions with a high success rate. Additionally, we successfully identified elements contributing to the improvement in behavior generation performance through an ablation study.
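The sequential-placement loop described above can be sketched without any model at all. In this sketch the function `propose_fn` stands in for the multimodal LLM (the grid room, the rejection rules, and all names are my assumptions, not the paper's system): at each step it proposes one object, and the environment accepts or rejects the placement.

```python
def run_layout_agent(instructions, propose_fn, room=(10, 10), max_steps=10):
    """Sequentially place objects in a grid 'virtual space'.
    propose_fn(instructions, layout_so_far) returns (name, x, y),
    or None to stop; invalid placements are silently rejected."""
    layout = []
    occupied = set()
    for _ in range(max_steps):
        proposal = propose_fn(instructions, layout)
        if proposal is None:
            break
        name, x, y = proposal
        # Reject placements outside the room or on an occupied cell.
        if 0 <= x < room[0] and 0 <= y < room[1] and (x, y) not in occupied:
            layout.append((name, x, y))
            occupied.add((x, y))
    return layout
```

Feeding the running `layout` back into `propose_fn` is the key point of the agent-driven approach: unlike one-shot text generation, each placement can condition on the state produced by the previous ones.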


Apple Vision Pro two months later: A telepresence dream

Engadget

Two months after I started using the Apple Vision Pro, it hasn't transformed the way I live. It hasn't replaced my TV, and it doesn't make me want to give up my powerful desktop or slim laptops. It's just another tool in my gadget arsenal -- one I can don to catch up on X-Men '97 in bed, or to help me dive deep into research while I'm away from my office. The Vision Pro becomes normal so quickly, it's almost easy to forget how groundbreaking it actually is. Its screens are still absolutely stunning, and the combination of eye tracking and Apple's gesture controls makes for the most intuitive AR/VR interface I've seen yet.


Here are the most useful Apple Vision Pro apps at launch

Engadget

Although there are some big-name omissions (Netflix, YouTube and Spotify), the headset already supports over a million compatible App Store apps, Apple's first-party offerings and over 600 apps developed specifically for the "spatial computing" device. Here are the notable third-party Vision Pro apps you can install on day one. Microsoft didn't skimp on its entry into the Vision Pro era. Seven of the company's Office apps are available to install on launch day. These include Microsoft Teams, Word, Excel, PowerPoint, Outlook, OneNote and Loop.